Abstract


In  this  paper  we  describe  the  Nelder-Mead  simplex  method  for  obtaining  the 
minimizer  of  a  function.  The  Nelder-Mead  algorithm  has  several  properties  that 
make  it  a  natural  choice  for  implementation  and  utilization  on  microcomputers. 
Stopping  criteria  for  the  method  are  presented  as  well  as  a  brief  discussion  of  the 
convergence  properties  of  the  method.  An  algorithmic  statement  of  the  method 
is  included  as  an  appendix. 


1.  Introduction.  We consider the problem

$$ \min_{x \in \mathbb{R}^n} f(x), \tag{1.1} $$

where $f : \mathbb{R}^n \to \mathbb{R}^1$ and the problem is to be solved on a microcomputer. The
fact  that  a  microcomputer  is  being  used  and  that  problem  (1.1)  is  solvable 
on  this  microcomputer  leads  us  to  make  several  assumptions  about  the 
problem  and  the  solution  environment.  First,  we  assume  the  amount  of 
storage  is  small  and,  therefore,  the  number  of  variables,  i.e.  n,  is  also  small. 
Additionally,  we  assume  that  computing  derivatives  of  the  function  is  not 
feasible. 

There is a class of methods, called direct search methods (see
Swann [6] or Brent [1]), that attempt to solve problem (1.1) using only
function  value  information.  One  particular  direct  search  method  that  is 
used  quite  frequently  is  the  Nelder-Mead  simplex  method  presented  by 
Nelder  and  Mead  [3].  Additional  references  and  several  modifications  of  the 
algorithm  are  discussed  in  Parkinson  and  Hutchinson  [5]  and  Olsson  and 
Nelson  [4].  The  original  method  of  Nelder  and  Mead  is  best-suited  for  our 
purposes. 

The  properties  of  the  Nelder-Mead  algorithm  that  make  it  appropriate 
for  our  problem  and  environment  are  its  robustness,  its  simplicity  in 
programming  and  its  low  overhead  in  storage  and  computation.  We  say  the 
algorithm  is  robust  because  it  is  very  tolerant  of  noise  in  the  function 
values.  Therefore,  the  function  need  not  be  computed  exactly  and  it  may 
be  possible  to  obtain  an  approximate  function  value  using  many  fewer 
floating  point  computations. 

As  we  shall  see  in  the  following  section,  the  algorithm  is  very  simple  to 
program.  Trial  points  are  obtained  using  very  simple  algebraic 
manipulations  and  these  points  are  accepted  or  rejected  based  only  on  their 
function  values.  Also,  when  the  number  of  variables  is  small,  this  algorithm 
is  often  competitive  with  much  more  complex  algorithms  that  require  a 
great  deal  of  overhead  in  storage  and  algebraic  manipulations.  The  low 
overhead  and  basic  simplicity  of  this  algorithm  make  it  a  natural  choice  for 
use  on  microcomputers.  An  algorithmic  specification  of  the  method  is  given 
in  the  Appendix. 


2.  Algorithm.  At each iteration of the Nelder-Mead simplex algorithm,
$n+1$ points, denoted by $x_1, x_2, \ldots, x_{n+1}$, are used to compute trial steps.
We will often refer to certain of these points based on the order induced by
their function values, that is, at the $k$th iteration we have
$x_1^k, x_2^k, \ldots, x_{n+1}^k$, with $f(x_1^k) \le f(x_2^k) \le \cdots \le f(x_{n+1}^k)$. A trial step is
accepted or rejected based on the function value of the trial point and the
three function values $f(x_1)$, $f(x_n)$, and $f(x_{n+1})$.

The $n+1$ points used at an iteration may be thought of as the vertices
of an $n$-dimensional simplex. In $\mathbb{R}^2$, for example, three points determine a
triangle. We denote a simplex $S_k$, with vertices $x_1, x_2, \ldots, x_{n+1}$, by
$S_k = \langle x_1, x_2, \ldots, x_{n+1} \rangle$. It is often the case that a specific vertex of a
specific simplex is referenced. Thus, the notation $x_1^k$ is used to indicate the
vertex of simplex $S_k$ that has the lowest function value. In Figure 2.1
below, if $f(x_1) = 10.0$, $f(x_2) = 7.0$, and $f(x_3) = 3.0$, then we would have
$x_1 \leftarrow x_3$, $x_2 \leftarrow x_2$, and $x_3 \leftarrow x_1$.

Trial steps are generated by the operations of reflection, expansion,
contraction, and shrinkage. A reflected vertex is computed by reflecting the
worst vertex, $x_{n+1}$, through the centroid of the remaining vertices. Nelder
and Mead compute the reflected vertex as

$$ x_r = (1+\alpha)\bar{x} - \alpha x_{n+1}, \tag{2.1} $$

where $\alpha = 1$, and $\bar{x}$ is the centroid defined by

$$ \bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i. $$

The reflected vertex is accepted if $f(x_1) \le f(x_r) < f(x_n)$, and the next
iteration begins with the simplex defined by $\langle x_1, x_2, \ldots, x_n, x_r \rangle$. Note
that $x_r$ has not been ordered with respect to the other vertices.
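
The reflection step (2.1) can be sketched in a few lines of Python. This is a minimal illustration, assuming the vertices are stored as lists of floats already ordered from best to worst; the helper names `centroid` and `reflect` are ours and do not come from Nelder and Mead.

```python
def centroid(vertices):
    """Centroid of every vertex except the last (worst) one."""
    best_n = vertices[:-1]
    n = len(best_n)
    dim = len(best_n[0])
    return [sum(v[j] for v in best_n) / n for j in range(dim)]


def reflect(vertices, alpha=1.0):
    """Reflect the worst vertex through the centroid, as in (2.1)."""
    xbar = centroid(vertices)
    worst = vertices[-1]
    return [(1 + alpha) * c - alpha * w for c, w in zip(xbar, worst)]


# Hypothetical triangle: centroid of the two best vertices is (1, 0).
simplex = [[0.0, 0.0], [2.0, 0.0], [1.0, 3.0]]
print(reflect(simplex))
```

With $\alpha = 1$ the reflected point lies as far beyond the centroid as the worst vertex lies before it.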

If the reflected vertex has a lower function value than $x_1$, i.e.,
$f(x_r) < f(x_1)$, then the trial step has produced a good point and the step is
expanded. The expansion vertex is computed as

$$ x_e = \gamma x_r + (1-\gamma)\bar{x}, \tag{2.2} $$

where $\gamma = 2$. The expansion vertex is accepted if $f(x_e) < f(x_1)$; otherwise
the reflected vertex is accepted. Thus, if $f(x_r) < f(x_1)$, then either the
reflected or expanded vertex is accepted and the next iteration begins.
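
The choice between the reflected and expanded vertices can be sketched as follows. The function name `expand_or_reflect` and the two test functions are illustrative assumptions; the sketch presumes the caller has already verified that the reflected vertex improves on the best vertex.

```python
def expand_or_reflect(f, x1, xbar, xr, gamma=2.0):
    """Return the vertex accepted when f(xr) < f(x1), following (2.2)."""
    xe = [gamma * r + (1 - gamma) * c for r, c in zip(xr, xbar)]
    # Keep the expansion only if it also improves on the best vertex.
    return xe if f(xe) < f(x1) else xr


x1, xbar, xr = [1.0, 0.0], [1.0, 0.5], [2.0, 0.5]

def far(x):
    # Minimizer at (3, 0): the longer step pays off.
    return (x[0] - 3.0) ** 2 + x[1] ** 2

def near(x):
    # Minimizer at (2, 0): the longer step overshoots.
    return (x[0] - 2.0) ** 2 + x[1] ** 2

print(expand_or_reflect(far, x1, xbar, xr))
print(expand_or_reflect(near, x1, xbar, xr))
```

In the first call the expanded vertex is kept; in the second the expansion overshoots and the reflected vertex is kept instead.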


FIGURE  2.1 

If the reflected vertex is not a better point than $x_n$, i.e., $f(x_n) \le f(x_r)$,
then a contraction step is computed. If the worst vertex is at least as good
as the reflected vertex, i.e., $f(x_{n+1}) \le f(x_r)$, then the internal contraction
vertex is computed as

$$ x_c = \beta x_{n+1} + (1-\beta)\bar{x}, \tag{2.3} $$

otherwise, the external contraction vertex is computed as

$$ x_c = \beta x_r + (1-\beta)\bar{x}, \tag{2.4} $$

where $\beta = \frac{1}{2}$. The contraction vertex is accepted if it has a lower function
value than $x_n$.
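
The selection between the internal and external contraction vertices can be sketched as below. The name `contract` is illustrative, and the default `beta=0.5` is the customary Nelder-Mead value, stated here as an assumption.

```python
def contract(f, xbar, xr, x_worst, beta=0.5):
    """Contraction vertex: internal (2.3) if the worst vertex is at least
    as good as the reflected vertex, external (2.4) otherwise."""
    base = x_worst if f(x_worst) <= f(xr) else xr
    return [beta * b + (1 - beta) * c for b, c in zip(base, xbar)]


xbar, x_worst, xr = [1.0, 0.0], [1.0, 3.0], [1.0, -3.0]

def tie(x):
    # Worst and reflected vertices have equal values: internal contraction.
    return x[0] ** 2 + x[1] ** 2

def tilted(x):
    # Reflected vertex is better than the worst: external contraction.
    return x[0] ** 2 + (x[1] + 1.0) ** 2

print(contract(tie, xbar, xr, x_worst))
print(contract(tilted, xbar, xr, x_worst))
```

The caller would still compare the returned vertex against $x_n$ before accepting it.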


If both the reflection vertex and the contraction vertex are rejected,
then the simplex is shrunk. The shrinkage operation is performed by
replacing each vertex $x_i$, except $x_1$, by the point halfway between $x_i$ and
$x_1$. This may be written as

$$ x_i \leftarrow \frac{x_i + x_1}{2}. \tag{2.5} $$

Finally, the values $f(x_i)$ are computed and sorted along with $f(x_1)$. This
order determines the simplex $\langle x_1, x_2, \ldots, x_{n+1} \rangle$ with which the next
iteration commences.
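
A shrinkage step followed by the re-sort might look like this in Python. The function name `shrink` and the sample data are assumptions made for illustration.

```python
def shrink(f, vertices):
    """Move every vertex halfway toward the best vertex, as in (2.5),
    then re-sort by function value for the next iteration."""
    x1 = vertices[0]
    moved = [[(a + b) / 2 for a, b in zip(x, x1)] for x in vertices[1:]]
    return sorted([x1] + moved, key=f)


def sum_of_squares(x):
    return x[0] ** 2 + x[1] ** 2


ordered = [[0.0, 0.0], [0.0, 2.0], [4.0, 0.0]]
print(shrink(sum_of_squares, ordered))
```

The best vertex is unchanged, so shrinkage costs $n$ new function evaluations, compared with one or two for the other operations.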


If  one  envisions  the  simplex  sitting  on  the  surface  defined  by  the 
function,  then  the  operations  of  the  Nelder-Mead  algorithm  can  be  thought 
of  as  the  simplex  tumbling  down  the  surface.  When  the  simplex  has 
reached  a  point  where  further  tumbling  is  not  possible,  the  simplex 
contracts,  or  shrinks  towards  its  lowest  point,  and  the  tumbling  continues. 
Figure 2.2 below illustrates the various trial points for the 2-dimensional
simplex $\langle x_1, x_2, x_3 \rangle$.

There are several stopping criteria that have been proposed for this
algorithm. Nelder and Mead suggest halting the algorithm when the
standard error of the function values falls below some threshold value.
That is, the algorithm is halted when the following condition holds:

$$ \frac{1}{n} \sum_{i=1}^{n+1} \left( f(x_i) - \bar{f} \right)^2 < \epsilon_1, \tag{2.6} $$

where $\bar{f}$ is the average of the function values and $\epsilon_1 > 0$ is some preset
value.  Parkinson  and  Hutchinson  [5]  propose  a  stopping  criterion  based  on 
how  far  the  simplex  moves  at  an  iteration.  They  suggest  halting  the 
algorithm  when  the  following  condition  is  met: 


(2.7) 


where $\|\cdot\|$ is the $\ell_2$ norm, $\epsilon_2 > 0$, and $x_i^{k+1}$ is the $i$th unordered point in the
$(k+1)$st simplex.


Stopping  criteria  (2.6)  and  (2.7)  are  very  different.  The  algorithm  is 
halted  in  (2.6)  based  on  function  value  information,  while  (2.7)  uses  vertex 
information.  Certain  problems  can  arise  with  stopping  criterion  (2.6).  For 
example,  if  the  function  values  are  very  close,  then  the  algorithm  halts 
regardless  of  the  size  of  the  simplex.  That  is,  the  algorithm  may  halt  when 
the  simplex  is  very  large.  For  an  example  of  this  and  additional  difficulties 
with  stopping  criterion  (2.6),  see  Woods  [7]. 

Objections  to  using  (2.7)  as  the  stopping  criterion  may  also  be  raised. 
The  main  objection  to  (2.7)  is  that  the  left-hand  side  of  (2.7)  for  a  shrinkage 
step  will  be  greater  than  the  value  for  a  contraction  step,  and  we  have 
observed  that  shrinkage  occurs  frequently  when  the  simplex  is  in  a 
neighborhood  of  a  local  minimizer.  Woods  [7]  introduces  the  stopping 
criterion 


$$ \frac{1}{\Delta} \max_{2 \le i \le n+1} \| x_i - x_1 \| < \epsilon_3, \tag{2.8} $$

where $\Delta = \max(1, \|x_1\|)$ and $\epsilon_3 > 0$. This is a measure of the relative size of
the  simplex.  Preliminary  testing  of  (2.8)  has  indicated  that  it  is  a  useful 
stopping  criterion  for  the  Nelder-Mead  algorithm. 
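
The difference between a function-value test in the style of (2.6) and the relative-size test (2.8) can be illustrated with a short Python sketch. The helper names are ours, and the exact normalization of the first test (division by $n$, sum over all $n+1$ vertices) is an assumption.

```python
import math


def function_value_test(fvals, eps1):
    """Spread of the function values, in the style of (2.6)."""
    n = len(fvals) - 1
    fbar = sum(fvals) / len(fvals)
    return sum((fv - fbar) ** 2 for fv in fvals) / n < eps1


def relative_size_test(vertices, eps3):
    """Relative size of the simplex, in the style of (2.8)."""
    x1 = vertices[0]
    delta = max(1.0, math.sqrt(sum(c * c for c in x1)))
    size = max(
        math.sqrt(sum((a - b) ** 2 for a, b in zip(x, x1)))
        for x in vertices[1:]
    )
    return size / delta < eps3


# A large simplex on which the function happens to be flat.
big = [[0.0, 0.0], [100.0, 0.0], [0.0, 100.0]]
flat_values = [1.0, 1.0, 1.0]
print(function_value_test(flat_values, 1e-6))
print(relative_size_test(big, 1e-6))
```

On this data the function-value test would halt the algorithm although the simplex is still large, while the relative-size test would not.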


3.  Convergence  Properties.  Although  this  algorithm  is  used 
extensively,  the  convergence  theory  is  not  well-developed.  The  only 
convergence  results  of  which  we  are  aware  appear  in  Woods  [7],  for  a 
slightly  modified  version  of  the  algorithm,  and  in  the  forthcoming  paper  of 
Dennis  and  Woods  [2],  for  the  algorithm  as  stated  here.  The  result  of 
Dennis  and  Woods  states  that  if  the  algorithm  is  applied  to  a  strictly 
convex function and the level set of the function corresponding to the value
at  the  worst  vertex  of  the  initial  simplex  is  bounded,  then  the  algorithm  will 
converge  to  a  connected  set  of  points,  all  of  which  have  the  same  function 
value.  Additionally,  each  convergent  subsequence  of  the  sequence  of 
simplices  generated  by  the  algorithm  converges  to  a  totally  degenerate 
simplex,  i.e.,  a  single  point. 

Unfortunately,  the  convergence  theory  does  not  provide  the  desired 
result  of  convergence  to  the  minimizer  of  the  strictly  convex  function.  In 
fact,  Dennis  and  Woods  show  that  this  is  not  necessarily  true  under  their 
assumptions.  They  show  this  by  the  following  example: 

EXAMPLE 3.1: Let $c_1 = (0, 32)^T$, $c_2 = (0, -32)^T$, and consider the
strictly convex function $f(x) = \frac{1}{2} \max\{ \|x - c_1\|^2, \|x - c_2\|^2 \}$. The level
sets of this function are displayed in Figure 3.1, as is the initial simplex,
$S_0 = \langle x_1, x_2, x_3 \rangle = \langle (8, 0)^T, (-8, -4)^T, (-16, 10)^T \rangle$. It should be
obvious from the figure that both the reflected and contracted vertices are
rejected at this iteration and the simplex is shrunk. If the simplex were to
shrink at every iteration, then the sequence of simplices would converge to
the totally degenerate simplex $S = \langle x_1, x_1, x_1 \rangle$, which is not a local
minimizer.

Dennis and Woods show that the algorithm can be made to converge to
any point $(a, 0)^T$ depending upon the choice of the initial simplex. When
$a \neq 0$, the algorithm does not converge to the minimizer, which is $(0, 0)^T$.
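
The first iteration of Example 3.1 can be checked numerically with the sketch below. It takes the leading constant of $f$ to be $1/2$ and the contraction coefficient to be $1/2$, both of which are assumptions here; any positive leading constant gives the same comparisons.

```python
def f(x):
    """f(x) = (1/2) max(||x - c1||^2, ||x - c2||^2), c1 = (0, 32), c2 = (0, -32)."""
    d1 = x[0] ** 2 + (x[1] - 32.0) ** 2
    d2 = x[0] ** 2 + (x[1] + 32.0) ** 2
    return 0.5 * max(d1, d2)


x1, x2, x3 = (8.0, 0.0), (-8.0, -4.0), (-16.0, 10.0)

# Centroid of the two best vertices and reflection of the worst (alpha = 1).
xbar = ((x1[0] + x2[0]) / 2, (x1[1] + x2[1]) / 2)
xr = (2 * xbar[0] - x3[0], 2 * xbar[1] - x3[1])

# Contraction from whichever of x3 and xr is better (beta = 1/2).
base = x3 if f(x3) <= f(xr) else xr
xc = ((base[0] + xbar[0]) / 2, (base[1] + xbar[1]) / 2)

reflection_rejected = not (f(xr) < f(x2))
contraction_rejected = not (f(xc) < f(x2))
print(f(x1), f(x2), f(x3), f(xr), f(xc))
print(reflection_rejected, contraction_rejected)
```

Under these assumptions the contraction vertex has exactly the same function value as $x_2$, so its rejection depends on the acceptance test being a strict inequality.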


Figure  3.1 


4.  Conclusions.  The  Nelder-Mead  simplex  algorithm  is  very  well-suited 
for  use  on  microcomputers.  It  is  robust,  easy  to  program  and  requires  very 
little  storage  and  information  for  execution.  Although  convergence 
properties  for  the  algorithm  are  not  well  understood,  the  algorithm  is  used 
in  many  applications. 

Appendix. We now present an algorithmic statement of the Nelder-Mead
simplex method. For simplicity, we have introduced the temporary
variables $x^k$ and $x^\ell$. In an implementation of the algorithm, pointers to the
corresponding vertices would be used. Various stopping criteria for the
algorithm are discussed in Section 2.

Algorithm A-1: Nelder-Mead Simplex Algorithm

Given $S_0$ with vertices $\langle x_1, x_2, \ldots, x_{n+1} \rangle$, set $\alpha = 1$, $\beta = 1/2$, $\gamma = 2$.

For $k = 1, 2, \ldots$

    set $x_i^k = x_i^{k-1}$, $i = 1, \ldots, n$
    compute $\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$
    compute $x_r = (1+\alpha)\bar{x} - \alpha x_{n+1}$
    set $x^k = x_r$

    if ($f(x_r) < f(x_n)$) then
        if ($f(x_r) < f(x_1)$) then
            compute $x_e = \gamma x_r + (1-\gamma)\bar{x}$
            if ($f(x_e) < f(x_1)$) then $x^k = x_e$
    else
        set $x^\ell = x_{n+1}$
        if ($f(x_r) < f(x^\ell)$) then $x^\ell = x_r$
        compute $x_c = \beta x^\ell + (1-\beta)\bar{x}$
        if ($f(x_c) < f(x_n)$) then $x^k = x_c$
        else
            $x_j = \frac{x_1 + x_j}{2}$ for $j = 2, \ldots, n$
            $x^k = \frac{x_1 + x_{n+1}}{2}$

    Check the stopping criterion.

    Sort $f(x_1), f(x_2), \ldots, f(x_n), f(x^k)$ to
    obtain $S_k = \langle x_1, x_2, \ldots, x_{n+1} \rangle$.
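
The algorithmic statement above can be sketched in Python as follows. This is an illustrative reading of Algorithm A-1 and not the authors' implementation: the name `nelder_mead`, the default $\beta = 1/2$, the iteration cap `max_iter`, the use of a relative-size stopping test in the style of (2.8), and checking that test at the top of the loop are all assumptions.

```python
import math


def nelder_mead(f, simplex, eps=1e-8, max_iter=1000,
                alpha=1.0, beta=0.5, gamma=2.0):
    """Return the best vertex found, starting from n+1 vertices in R^n."""
    def combine(a, x, b, y):
        return [a * xi + b * yi for xi, yi in zip(x, y)]

    s = sorted((list(v) for v in simplex), key=f)
    n = len(s) - 1
    for _ in range(max_iter):
        x1, xn, xw = s[0], s[n - 1], s[n]
        # Stopping test: relative size of the simplex (assumed choice).
        delta = max(1.0, math.sqrt(sum(c * c for c in x1)))
        size = max(math.sqrt(sum((a - b) ** 2 for a, b in zip(x, x1)))
                   for x in s[1:])
        if size / delta < eps:
            break
        xbar = [sum(v[j] for v in s[:n]) / n for j in range(n)]
        xr = combine(1 + alpha, xbar, -alpha, xw)
        new = xr
        if f(xr) < f(xn):
            if f(xr) < f(x1):
                # Reflection beat the best vertex: try expanding.
                xe = combine(gamma, xr, 1 - gamma, xbar)
                if f(xe) < f(x1):
                    new = xe
        else:
            # Contract from the better of the reflected and worst vertices.
            xl = xr if f(xr) < f(xw) else xw
            xc = combine(beta, xl, 1 - beta, xbar)
            if f(xc) < f(xn):
                new = xc
            else:
                # Both trial points rejected: shrink toward the best vertex.
                s = sorted([x1] + [combine(0.5, x, 0.5, x1) for x in s[1:]],
                           key=f)
                continue
        s = sorted(s[:n] + [new], key=f)
    return s[0]


def quadratic(x):
    # Hypothetical test function with minimizer (1, -2).
    return (x[0] - 1.0) ** 2 + (x[1] + 2.0) ** 2


best = nelder_mead(quadratic, [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
print(best, quadratic(best))
```

For brevity the sketch re-evaluates $f$ at vertices it has already seen; an implementation intended for a machine with limited resources would store the $n+1$ function values alongside the vertices.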